[core] reuse AttentionMixin for compatible classes
#12463
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
from ...models.attention_processor import (
    ADDED_KV_ATTENTION_PROCESSORS,
    CROSS_ATTENTION_PROCESSORS,
    AttentionProcessor,
I see that many of the tests in AudioLDM2PipelineFastTests currently fail in the CI with the following error, e.g.:
FAILED tests/pipelines/audioldm2/test_audioldm2.py::AudioLDM2PipelineFastTests::test_inference_batch_consistent - AttributeError: type object 'ClapConfig' has no attribute 'from_text_audio_configs'
This method is called in AudioLDM2PipelineFastTests.get_dummy_components:
diffusers/tests/pipelines/audioldm2/test_audioldm2.py
Lines 141 to 145 in fa468c5
text_encoder_config = ClapConfig.from_text_audio_configs(
    text_config=text_branch_config,
    audio_config=audio_branch_config,
    projection_dim=16,
)
It looks like the ClapConfig.from_text_audio_configs method still exists in transformers==4.57.0 but has been removed on main. Given that this method is going away, should we replace this call with something like
class AudioLDM2PipelineFastTests(PipelineTesterMixin, unittest.TestCase):
    ...

    def get_dummy_components(self):
        ...
        text_encoder_config = ClapConfig(
            text_config=text_branch_config,
            audio_config=audio_branch_config,
            projection_dim=16,
        )
        ...
...?
Similarly, the following tests fail due to CLIPFeatureExtractor being removed:
FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_download_bin_only_variant_exists_for_model - AttributeError: module transformers has no attribute CLIPFeatureExtractor
FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_download_bin_variant_does_not_exist_for_model - AttributeError: module transformers has no attribute CLIPFeatureExtractor
FAILED tests/pipelines/test_pipelines.py::PipelineFastTests::test_wrong_model - AttributeError: module transformers has no attribute CLIPFeatureExtractor
Should we replace the calls to CLIPFeatureExtractor with CLIPImageProcessor, or do you think that should be separated into a new PR?
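If we do swap them here, a minimal sketch of the replacement (hypothetical; it assumes the tests only construct the extractor directly) could be:

```python
# Hypothetical sketch of the swap; CLIPImageProcessor is the class transformers
# kept when the CLIPFeatureExtractor alias was deprecated, and it accepts the
# same constructor arguments.
from transformers import CLIPImageProcessor

# before: feature_extractor = transformers.CLIPFeatureExtractor(crop_size=32)
feature_extractor = CLIPImageProcessor(crop_size=32)
```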
For #12463 (comment) see #12455
nit: Unrelated to this PR. Prefer discussing these separately.
    for name, module in self.named_children():
        fn_recursive_attn_processor(name, module, processor)

# Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor
Perhaps it's out of scope for this PR, but I see that a lot of models additionally have a set_default_attn_processor method, usually marked # Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.set_default_attn_processor. Do you think it makes sense to add this method to AttentionMixin as well?
IMO, not yet, since AttentionMixin is fairly agnostic to the model type, whereas set_default_attn_processor relies on some custom attention processor types. For UNet2DConditionModel, we have:
diffusers/src/diffusers/models/unets/unet_2d_condition.py
Lines 762 to 769 in fa468c5
if all(proc.__class__ in ADDED_KV_ATTENTION_PROCESSORS for proc in self.attn_processors.values()):
    processor = AttnAddedKVProcessor()
elif all(proc.__class__ in CROSS_ATTENTION_PROCESSORS for proc in self.attn_processors.values()):
    processor = AttnProcessor()
else:
    raise ValueError(
        f"Cannot call `set_default_attn_processor` when attention processors are of type {next(iter(self.attn_processors.values()))}"
    )
However, for AutoencoderKLTemporalDecoder:
diffusers/src/diffusers/models/autoencoders/autoencoder_kl_temporal_decoder.py
Lines 269 to 274 in fa468c5
if all(proc.__class__ in CROSS_ATTENTION_PROCESSORS for proc in self.attn_processors.values()):
    processor = AttnProcessor()
else:
    raise ValueError(
        f"Cannot call `set_default_attn_processor` when attention processors are of type {next(iter(self.attn_processors.values()))}"
    )
I'd be down to do that refactoring, though. Cc: @DN6
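If we did go for it later, one hypothetical shape (just a sketch reusing the UNet2DConditionModel branch logic quoted above, not something this PR adds) could be a shared method that picks the default processor from the installed processor classes:

```python
# Hypothetical sketch only: what a shared set_default_attn_processor on
# AttentionMixin might look like. The AutoencoderKLTemporalDecoder variant is
# the same minus the added-KV branch, so this covers both quoted copies.
from diffusers.models.attention_processor import (
    ADDED_KV_ATTENTION_PROCESSORS,
    CROSS_ATTENTION_PROCESSORS,
    AttnAddedKVProcessor,
    AttnProcessor,
)


def set_default_attn_processor(self):
    # `self` is a model that already exposes `attn_processors`/`set_attn_processor`
    # (i.e. anything inheriting from AttentionMixin).
    if all(proc.__class__ in ADDED_KV_ATTENTION_PROCESSORS for proc in self.attn_processors.values()):
        processor = AttnAddedKVProcessor()
    elif all(proc.__class__ in CROSS_ATTENTION_PROCESSORS for proc in self.attn_processors.values()):
        processor = AttnProcessor()
    else:
        raise ValueError(
            "Cannot call `set_default_attn_processor` when attention processors are of type "
            f"{next(iter(self.attn_processors.values()))}"
        )
    self.set_attn_processor(processor)
```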
Looks good to me! I think AuraFlowTransformer2DModel and AudioLDM2UNet2DConditionModel have their attn_processors/set_attn_processor methods deleted but are missing the corresponding change to inherit from AttentionMixin.
Thanks for those catches, @dg845. They should be fixed now.
LGTM :) |
What does this PR do?
Many models use "# Copied from ..." implementations of attn_processors and set_attn_processor. They are basically the same as what we have implemented in
diffusers/src/diffusers/models/attention.py
Line 39 in 693d8a3
This PR makes those models inherit from AttentionMixin and removes the copied-over implementations. I decided to leave fuse_qkv_projections and unfuse_qkv_projections out of this PR because some models don't have attention processors implemented in a way that would make this seamless. But the methods removed in this PR should be very harmless.
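For reviewers skimming the diff, the shared helpers being deduplicated look roughly like this (a paraphrased sketch of the AttentionMixin code in diffusers/src/diffusers/models/attention.py, not a verbatim copy):

```python
# Paraphrased sketch of the shared AttentionMixin helpers; exact error messages
# and type hints in diffusers may differ.
import torch.nn as nn


class AttentionMixin:
    @property
    def attn_processors(self) -> dict:
        # Walk the module tree and collect every attention processor, keyed by
        # the module path of the attention layer that owns it.
        processors = {}

        def _collect(name: str, module: nn.Module, processors: dict) -> dict:
            if hasattr(module, "get_processor"):
                processors[f"{name}.processor"] = module.get_processor()
            for sub_name, child in module.named_children():
                _collect(f"{name}.{sub_name}", child, processors)
            return processors

        for name, module in self.named_children():
            _collect(name, module, processors)
        return processors

    def set_attn_processor(self, processor) -> None:
        # Accept either a single processor (applied everywhere) or a dict keyed
        # like `attn_processors`, and assign it recursively.
        count = len(self.attn_processors)
        if isinstance(processor, dict) and len(processor) != count:
            raise ValueError(
                f"Expected a dict with {count} processors, got {len(processor)}."
            )

        def _assign(name: str, module: nn.Module, processor) -> None:
            if hasattr(module, "set_processor"):
                if not isinstance(processor, dict):
                    module.set_processor(processor)
                else:
                    module.set_processor(processor.pop(f"{name}.processor"))
            for sub_name, child in module.named_children():
                _assign(f"{name}.{sub_name}", child, processor)

        for name, module in self.named_children():
            _assign(name, module, processor)
```

A model picks these up by adding AttentionMixin to its base classes and deleting its copied attn_processors/set_attn_processor methods.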